Supplement to: Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
Abstract
For the DLBCL data, we not only list the selected genes but also search the existing literature for discussion of them. Our final estimated model uses 49 gene features, which correspond to 26 genes. To examine the relevance of each selected gene to DLBCL, we adopt two approaches. The first looks for literature examining the biological connection between the identified gene and any type of lymphoma. The second lists any reference in the (rather lengthy) methodological literature in statistics, computer science, and bioinformatics that applies statistical or machine learning methods to the DLBCL dataset. We display our findings for all 26 genes in Table 1. To summarize, 16 of the 26 genes have been related to lymphoma in the biological literature, and 19 have already been identified via statistical techniques developed for the DLBCL dataset. While many of the 26 genes have previously been connected to lymphoma in general and DLBCL in particular, AIMER identifies 4 genes, with symbols ALDH2, CELF2, COL16A1, and DHRS9, that have not previously been identified in either the biological or the methodological literature. We note that, although we have made every effort to locate each gene, the literature on this topic is large and evolving, so genes we were unable to locate may nonetheless have been studied previously.